ABSTRACT

We establish, for various scenarios, whether or not interruptible exact stationary 
sampling is possible when a finite-state Markov chain can only be viewed passively. In 
particular, we prove that such sampling is not possible using a single copy of the chain. 
Such sampling is possible when enough copies of the chain are available, and we provide 
an algorithm that terminates with probability one.

1 Introduction and summary

In recent years a large number of articles have been written about exact sampling 
(also called perfect sampling) using Markov chains. See [13] for an overview. The
rough idea is as follows. One wishes to sample from the unique stationary distribution
$\pi$ of an observed irreducible Markov chain. At each transition of the chain,
a decision is made whether to continue observing the chain or to stop. When the
observation is stopped, a value S is output and it is desired that, for all states i,
$P(S = i \mid \text{one ever stops observing the chain}) = \pi_i$. The decision about whether to stop
at a particular time is made on the basis of the evolution of the chain up through that 
time, possibly together with some additional randomness independent of the chain. 

The goal of our research leading to this paper was to determine whether or not it is
possible to carry out interruptible exact sampling for finite-state chains in what Propp
and Wilson [12] call the passive setting. (We will explain in Section 3.2 what is meant
by "interruptible" and "the passive setting".) Our central result is the following:

Interruptible exact sampling is not possible when one observes only a single trajectory. 

This result remains true even if we assume that the chain is aperiodic and reversible. 
[See Remark 6.2(b).] However, interruptible exact sampling is possible for an N-state
chain when one is able to observe, simultaneously, N trajectories. Here is a guide to
our specific results. 



(i) (positive:) We provide an algorithm (Algorithm 4.3) which, given an irreducible
Markov chain on N states as input, produces in (random) finite time an exact
sample from the tree distribution, and hence also an exact sample from $\pi$. (The
tree distribution is defined in Section 3.1.) The algorithm is interruptible, but
requires N independent synchronized trajectories from the chain. (See Theorem 4.4.)

(ii) (negative:) There is no algorithm in the passive setting for obtaining an
observation from the stationary distribution of an irreducible aperiodic Markov chain
on N states which uses fewer than N independent trajectories from the chain and
which is both interruptible and exact. (See Theorem 5.1.)

(iii) (negative:) There is no algorithm in the passive setting for obtaining an
observation from the common stationary distribution of any finite number of independent
irreducible aperiodic Markov chains on N states (with possibly different transition
matrices) which is both interruptible and exact. (See Theorem 6.1.) This remains
true even if we assume that all of the chains are reversible. [See Remark 6.2(a).]



2 Background 

In 1992, Asmussen, Glynn, and Thorisson |Q] demonstrated that exact sampling from a 
Markov chain is possible under certain circumstances. They also proved that it is not 
possible to obtain an exact sample from an arbitrary Markov chain without some prior 
knowledge about the chain; in particular, the size of the state space must be known.
Although their paper does provide a method for generating exact samples from an
N-state Markov chain when N is known, the paper is primarily of a theoretical nature,
and the method is complicated and inefficient.

In 1995, Lovasz and Winkler [^] provided a simpler and more efficient algorithm 
for obtaining an exact sample from an irreducible N-state Markov chain. Although not
mentioned explicitly in their paper, the method described in Section 3 of Lovasz and
Winkler can in fact be used to obtain an exact sample from the tree distribution of
the Markov chain (as defined in Section 3.1). Aldous [ffl], Broder pj, and Propp and
Wilson [12] also describe algorithms for sampling from the tree distribution. Propp
and Wilson [12] discuss and compare these and other methods of sampling from the
tree distribution, and from the stationary distribution. Their discussion includes
consideration of such issues as whether or not the sampling is exact or interruptible. To
our knowledge, the question of whether interruptible exact sampling is possible in the
passive case (as described in Section 3.2) has not previously been considered.



3 Preliminaries 

3.1 The tree distribution 

Throughout this paper we consider only finite-state irreducible Markov chains. We
assume that the number of states, call it N, is known; in fact, it turns out that we may
as well assume (and so we do) that the state space is known to be [N] := {1, . . . , N}.
We denote the transition matrix of such a chain generically by $P = (p_{ij})$.

An irreducible Markov chain on [N] can be viewed equivalently as a random walk
on a connected weighted directed graph G. The vertex set of G is [N], and there is an
edge from i to j, with weight $p_{ij}$, if and only if $p_{ij} > 0$.

For the moment, let us consider an undirected graph G with vertex set [N]. Then a
subgraph T of G is called a spanning tree if it contains all N vertices and is connected
and acyclic. From any spanning tree, we obtain a directed spanning tree by assigning a
direction to each edge. A directed spanning tree is called an arborescence rooted at a
given vertex r if all edges are directed towards r.

We define the weight w(T) of an arborescence T with edges $\{e_l\}$ as
$w(T) := \prod_{l} p(e_l)$, where $p(e_l) := p_{ij}$ if $e_l$ is directed from i to j. For the remainder of
this paper, when we say "tree" we mean an arborescence T with w(T) > 0. The tree
distribution of the Markov chain is the probability distribution on trees obtained by
normalizing the weights w(T) so as to sum to unity.

The Markov chain tree theorem is the well-known result (see, for example, [10] or [12])
that the stationary distribution $\pi$ of the chain can be expressed simply in terms of the
tree distribution:

\[ \pi_i = w_i / w, \qquad i \in [N], \]

where, writing $\mathcal{T}_i$ for the set of trees rooted at i and $\mathcal{T}$ for $\bigcup_{i \in [N]} \mathcal{T}_i$,

\[ w_i := \sum_{T \in \mathcal{T}_i} w(T), \qquad w := \sum_{i \in [N]} w_i = \sum_{T \in \mathcal{T}} w(T). \]

In particular, any algorithm for sampling from the tree distribution provides a means
of sampling from $\pi$: simply output the root of the tree.
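The Markov chain tree theorem can be checked numerically on a small chain. The following is a minimal sketch, not taken from the paper: it enumerates all arborescences by brute force (feasible only for very small N), uses 0-indexed states, and the function names `tree_weights`, `reaches_root` and `stationary_from_trees` as well as the example matrix are illustrative assumptions.

```python
from itertools import product


def reaches_root(v, parent, root):
    """Follow parent pointers from v; True if we reach root without a cycle."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True


def tree_weights(P):
    """Return [w_0, ..., w_{n-1}], where w_i sums w(T) over arborescences rooted at i."""
    n = len(P)
    w = [0.0] * n
    for root in range(n):
        others = [v for v in range(n) if v != root]
        # Each non-root vertex chooses the head of its single outgoing edge.
        for heads in product(range(n), repeat=len(others)):
            parent = dict(zip(others, heads))
            if all(reaches_root(v, parent, root) for v in others):
                weight = 1.0
                for v in others:
                    weight *= P[v][parent[v]]
                w[root] += weight
    return w


def stationary_from_trees(P):
    """Normalize the root weights: pi_i = w_i / w."""
    w = tree_weights(P)
    total = sum(w)
    return [wi / total for wi in w]


# Hypothetical 3-state example chain.
P = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]
pi = stationary_from_trees(P)
pi_P = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
print(pi, pi_P)  # the two vectors should agree up to rounding
```

Choices of heads that contain a cycle (including self-loops) are rejected, so only genuine arborescences contribute; graphs with a zero-probability edge contribute weight zero and therefore do not affect the sums.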

3.2 The passive case; interruptible exact sampling 



Propp and Wilson [12] distinguish between the active setting and the passive setting for
sampling using a Markov chain. In the active setting, an algorithm is assumed to have
access at all times to a transition generator, that is, to a routine which, given any input
state i, generates an observation j from the probability distribution $(p_{ij} : j \in [N])$,
independent of all previously generated observations. In particular, a user can generate 
a trajectory from P with any desired initial state. In the passive setting, the algorithm 
has no control over the initial state and can only watch passively as the chain transitions 
from one state to the next. 
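The difference between the two settings can be phrased in terms of what a program is allowed to call. The sketch below is only an illustration under assumed names (`transition_generator`, `passive_trajectory`) and 0-indexed states: in the active setting the caller picks the input state of every step, whereas in the passive setting the caller merely consumes a stream of states whose start it did not choose.

```python
import random
from itertools import islice


def transition_generator(P, i, rng):
    """Active setting: draw one step from row i of P, for any state i the caller picks."""
    return rng.choices(range(len(P)), weights=P[i])[0]


def passive_trajectory(P, rho, rng):
    """Passive setting: yield successive states of one trajectory.

    The initial state is drawn from rho; the consumer cannot choose or reset it.
    """
    state = rng.choices(range(len(P)), weights=rho)[0]
    while True:
        yield state
        state = rng.choices(range(len(P)), weights=P[state])[0]


rng = random.Random(0)
P = [[0.9, 0.1], [0.5, 0.5]]              # hypothetical two-state chain
print(transition_generator(P, 1, rng))    # active: we chose to step from state 1
print(list(islice(passive_trajectory(P, [1.0, 0.0], rng), 5)))  # passive: just watch
```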

We now explain what is meant by an (on-line, Markov-chain-based) interruptible
exact sampling algorithm in the passive case; for simplicity, we will do this explicitly
only in the case that a single trajectory from the chain is available and the desired
output is an observation from the stationary distribution $\pi$ (rather than one from the tree
distribution). Informally, an exact sampling algorithm must take as input a trajectory
from the given Markov chain; possibly using external randomization to make its
decisions, it watches the chain only until some finite time and then returns an observation
distributed according to $\pi$. (Important note: The state returned is not necessarily the
state of the chain at the stopping time.) More formally, we can define an exact sampling
algorithm as a collection of functions $\phi_{k,i} : [N]^{k+1} \to [0, 1]$ [with $\phi_{k,i}(x_0, \ldots, x_k)$ to be
interpreted informally as the conditional probability that the algorithm stops by time k
and outputs i, given that it sees the trajectory $(x_0, \ldots, x_k)$ through time k] having
the following properties, where (iii) and (iv) must hold for all $\pi$, for all $\rho$, and for all
irreducible transition matrices $P = (p_{ij})$ on [N] with stationary distribution $\pi$:

(i) $\forall k \geq 0 \ \forall (x_0, \ldots, x_k) \in [N]^{k+1}$: $\sum_{i} \phi_{k,i}(x_0, \ldots, x_k) \leq 1$;

(ii) $\forall i \in [N] \ \forall k \geq 0 \ \forall (x_0, x_1, \ldots) \in [N]^{\infty}$: $\phi_{k,i}(x_0, \ldots, x_k) \uparrow$ as $k \uparrow$;

(iii) $\lim_{k \to \infty} \sum_{j \in [N]} \sum_{x_0, x_1, \ldots, x_k} \rho_{x_0} p_{x_0,x_1} \cdots p_{x_{k-1},x_k} \, \phi_{k,j}(x_0, x_1, \ldots, x_k) > 0$;

(iv) $\forall i \in [N]$:
\[ \lim_{k \to \infty} \frac{\sum_{x_0, \ldots, x_k} \rho_{x_0} p_{x_0,x_1} \cdots p_{x_{k-1},x_k} \, \phi_{k,i}(x_0, \ldots, x_k)}{\sum_{j \in [N]} \sum_{x_0, \ldots, x_k} \rho_{x_0} p_{x_0,x_1} \cdots p_{x_{k-1},x_k} \, \phi_{k,j}(x_0, \ldots, x_k)} = \pi_i. \]

In terms of the chain X observed and the stopping time $\tau$ and output state S for
the algorithm, the properties can be interpreted informally as (i) $P(\tau \leq k \mid X) \leq 1$;
(ii) $P(\tau \leq k \mid X) \uparrow$ as $k \uparrow$; (iii) $P(\tau < \infty) > 0$; and (iv) $P(S = i \mid \tau < \infty) = \pi_i$. When
the strengthening

(iii') $\lim_{k \to \infty} \sum_{j \in [N]} \sum_{x_0, x_1, \ldots, x_k} \rho_{x_0} p_{x_0,x_1} \cdots p_{x_{k-1},x_k} \, \phi_{k,j}(x_0, x_1, \ldots, x_k) = 1$

[interpreted as $P(\tau < \infty) = 1$] of (iii) holds, we will call the algorithm terminating.
When (iv) can be strengthened to

(iv') $\forall i \in [N] \ \forall k \geq 0$:
\[ \sum_{x_0, \ldots, x_k} \rho_{x_0} p_{x_0,x_1} \cdots p_{x_{k-1},x_k} \, \phi_{k,i}(x_0, \ldots, x_k) = \pi_i \times \sum_{j \in [N]} \sum_{x_0, \ldots, x_k} \rho_{x_0} p_{x_0,x_1} \cdots p_{x_{k-1},x_k} \, \phi_{k,j}(x_0, x_1, \ldots, x_k) \]

[interpreted as the independence $P(\tau \leq k, S = i) = P(\tau \leq k)\,\pi_i$ of $\tau$ and $S \sim \pi$], we say
that the algorithm is interruptible. An interruptible algorithm can be aborted without 
biasing output; see the discussion in p. For active-case algorithms, the leading exam- 
ple of a non- interruptible algorithm is coupling from the past [11|, while interruptible 
algorithms include cycle popping [|^, Fill's rejection-based algorithm |^] and the 
Randomness Recycler |^]. The results of this paper, both positive and negative, are for 
interruptible algorithms. 



4 A terminating algorithm for interruptible exact sam- 
pling in the passive case 

In this section we present a terminating algorithm for interruptible exact stationary
sampling in the passive case, assuming that one can watch N synchronized copies
$X_l = (X_l(t) : t = 0, 1, \ldots)$, $l \in [N]$, of a Markov chain with state space [N] and irreducible
transition matrix P. We allow arbitrary initial distribution $\rho$ for the N-variate chain
$X := (X_1, \ldots, X_N)$, but we assume that $X_1, \ldots, X_N$ are conditionally independent
given the initial state $(X_1(0), \ldots, X_N(0))$. The algorithm will produce an observation
from the tree distribution corresponding to P (recall Section 3.1).



4.1 The algorithm in a restricted setting 

In this subsection we present a terminating algorithm for interruptible exact tree- 
sampling in the passive case that works under the following additional restriction on P: 

Assumption A: $p_{1j} > 0$ for all $j \in [N]$.

While this assumption may seem unreasonably restrictive, we will show in Section 4.2
how a simple modification of the algorithm can handle the more general case.
To describe the algorithm we first define the following events for even $t \geq 2$:

\[
\begin{aligned}
A_t &:= \textstyle\bigcap_{l \in [N]} \{X_l(t-2) = 1\}, \\
B_t &:= \{X_1(t-1) = 1\}, \\
C_t &:= \{\{X_1(t), X_2(t-1), \ldots, X_N(t-1)\} = [N]\}, \\
D_t(T) &:= \{\text{the graph with directed edges from } X_l(t-1) \text{ to } X_l(t),\ 2 \leq l \leq N, \text{ is the arborescence } T\}, \\
D_t &:= \textstyle\bigcup_{T \in \mathcal{T}} D_t(T), \\
E_t(T) &:= A_t \cap B_t \cap C_t \cap D_t(T), \\
E_t &:= \textstyle\bigcup_{T \in \mathcal{T}} E_t(T) = A_t \cap B_t \cap C_t \cap D_t.
\end{aligned}
\]

Algorithm 4.1 (Terminating interruptible tree-sampling, under Assumption A).

For even $t \geq 2$, let $E_t(T)$ and $E_t$ be defined as above, and let $E_0 := \emptyset$. The algorithm
is:

t ← 0
repeat
    t ← t + 2
until $E_t$ holds
S ← T, for the unique $T \in \mathcal{T}$ such that $E_t(T)$ holds
return S

Theorem 4.2. When Assumption A holds, Algorithm 4.1 is a terminating algorithm for
interruptible exact tree-sampling.
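To make the events concrete, here is a hedged simulation sketch of Algorithm 4.1 as we read it. It assumes 0-indexed states (so the paper's state 1 is index 0), simulates the N chains itself rather than observing them, and uses illustrative names (`step`, `is_arborescence`, `algorithm_4_1`) and a hypothetical two-state chain; it is a reading aid, not the authors' implementation.

```python
import random


def step(P, state, rng):
    """One transition of the chain from `state`."""
    return rng.choices(range(len(P)), weights=P[state])[0]


def is_arborescence(parent, root):
    """True if every vertex in `parent` reaches `root` by following parent pointers."""
    for v in parent:
        seen = set()
        while v != root:
            if v in seen:
                return False
            seen.add(v)
            v = parent[v]
    return True


def algorithm_4_1(P, init, rng, max_steps=10**7):
    """Watch N synchronized chains two steps at a time until E_t holds.

    Returns (t, root, parent), where `parent` maps each non-root vertex to the
    head of its outgoing edge in the sampled tree.
    """
    n = len(P)
    x_tm2 = list(init)                                 # X(t-2)
    t = 0
    while t < max_steps:
        t += 2
        x_tm1 = [step(P, s, rng) for s in x_tm2]       # X(t-1)
        x_t = [step(P, s, rng) for s in x_tm1]         # X(t)
        a = all(s == 0 for s in x_tm2)                 # A_t
        b = x_tm1[0] == 0                              # B_t
        c = {x_t[0], *x_tm1[1:]} == set(range(n))      # C_t
        if a and b and c:
            root = x_t[0]
            parent = {x_tm1[l]: x_t[l] for l in range(1, n)}
            if is_arborescence(parent, root):          # D_t
                return t, root, parent
        x_tm2 = x_t
    raise RuntimeError("no termination within max_steps")


rng = random.Random(2024)
P = [[0.7, 0.3], [0.4, 0.6]]       # hypothetical chain satisfying Assumption A
t, root, tree = algorithm_4_1(P, [1, 1], rng)
print(t, root, tree)
```

For this example the stationary distribution is (4/7, 3/7), so over many independent runs the returned root should be state index 0 in roughly 57% of them, whatever the stopping time turns out to be.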

Proof. Let $\tau$ denote the supremum of the values of the variable t during the operation
of Algorithm 4.1. Now fix a candidate value t of $\tau$. Let $T \in \mathcal{T}$ be an arborescence,
say with edges $e_l$ directed from $i_l$ to $j_l$, which we choose to index (in some arbitrary
but fixed order) by $l \in \{2, \ldots, N\}$. The event $D_t(T)$ is a disjoint union of $(N-1)!$
subevents, with each subevent corresponding to a way of mapping the $N - 1$ transitions
$(X_l(t-1), X_l(t))$ to the $N - 1$ edges $e_l$. These subevents will all enter symmetrically
into the calculation below of $P(\tau = t, S = T)$. One such subevent is

\[ D'_t(T) := \bigcap_{l=2}^{N} \{(X_l(t-1), X_l(t)) = (i_l, j_l)\}. \]

Let $\{i_1\}$ denote the singleton $[N] \setminus \{i_2, \ldots, i_N\}$.
Define

\[ A'_t := \{\tau > t - 2\} \cap A_t, \qquad \text{even } t \geq 2. \]

Then, using the Markov property and independence of the trajectories,

\[
\begin{aligned}
P(\tau = t, S = T)
&= P(\{\tau > t-2\} \cap E_t(T)) = P(A'_t \cap B_t \cap C_t \cap D_t(T)) \\
&= (N-1)! \, P(A'_t \cap B_t \cap C_t \cap D'_t(T)) \\
&= (N-1)! \, P(A'_t) \, P(B_t \mid X_1(t-2) = 1)
   \left[ \prod_{l=2}^{N} P(X_l(t-1) = i_l \mid X_l(t-2) = 1) \right] \\
&\qquad \times P(X_1(t) = i_1 \mid B_t)
   \left[ \prod_{l=2}^{N} P(X_l(t) = j_l \mid X_l(t-1) = i_l) \right] \\
&= (N-1)! \, P(A'_t) \, p_{11} \left( \prod_{l=2}^{N} p_{1,i_l} \right) p_{1,i_1} \, w(T)
 = (N-1)! \, P(A'_t) \, p_{11} \left( \prod_{i=1}^{N} p_{1i} \right) w(T).
\end{aligned}
\]

Summing over $T \in \mathcal{T}$ we find

\[ P(\tau = t) = (N-1)! \, P(A'_t) \, p_{11} \left( \prod_{i=1}^{N} p_{1i} \right) w \]

and therefore

\[ P(\tau = t, S = T) = P(\tau = t) \, \frac{w(T)}{w}, \]

which shows that Algorithm 4.1 is an interruptible exact tree-sampling algorithm.
Using the fact that X visits (1, . . . , 1) at even times infinitely often (a.s.) together
with the strong Markov property of X, it is clear that termination occurs at the first
success in an almost surely infinite sequence of Bernoulli trials with success
probability $(N-1)! \, p_{11} \left( \prod_{i=1}^{N} p_{1i} \right) w > 0$ (note that this is where Assumption A is used). Thus
$P(\tau < \infty) = 1$, that is, Algorithm 4.1 is terminating. □



4.2 The algorithm in the general setting 

To avoid needing Assumption A, we can use the averaging technique of Lovasz and
Winkler. Let $P^k$ be the k-step transition matrix of the chain X. Then
$\bar{P} := \frac{1}{N} \sum_{k=1}^{N} P^k$ is an irreducible transition matrix with all entries positive. Moreover, we
can effectively use the original chain to sample from this "averaged" chain. The resulting
more general algorithm (Algorithm 4.3) obtains, interruptibly, an exact sample T from
the tree distribution of P.
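The averaged matrix is easy to compute directly, which makes the positivity claim concrete. The sketch below uses plain lists and illustrative names (`mat_mul`, `averaged_matrix`), and the 3-state matrix is a made-up example containing zero entries; it is meant only to show the construction $\frac{1}{N}\sum_{k=1}^{N} P^k$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def averaged_matrix(P):
    """Return (1/N) * (P + P^2 + ... + P^N) for an N-state transition matrix P."""
    n = len(P)
    power = [row[:] for row in P]      # current power P^k, starting at k = 1
    total = [row[:] for row in P]      # running sum of powers
    for _ in range(n - 1):
        power = mat_mul(power, P)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return [[entry / n for entry in row] for row in total]


# Hypothetical irreducible chain with several zero entries.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0]]
P_bar = averaged_matrix(P)
for row in P_bar:
    print(row)
```

Sampling one step of the averaged chain from the original chain amounts to drawing U uniformly from {1, ..., N} and looking U steps ahead, which is how the random offsets are used in the events defined next.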



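To describe Algorithm 4.3, which works in the general setting described at the outset
of Section 4, for $t \geq 2N$ we define the following events to be used in the context of the
algorithm:

\[
\begin{aligned}
A_t &:= \textstyle\bigcap_{l \in [N]} \{X_l(t - 2N) = 1\}, \\
B_t &:= \{X_1(t - 2N + U_0) = 1\}, \\
C_t &:= \{\{X_1(t - 2N + U_0 + U_1), X_2(t - 2N + U_2), \ldots, X_N(t - 2N + U_N)\} = [N]\}, \\
D_t(T) &:= \{\text{the graph with directed edges from } X_l(t - 2N + U_l) \text{ to } X_l(t - 2N + U_l + 1),\ 2 \leq l \leq N, \text{ is the arborescence } T\}, \\
D_t &:= \textstyle\bigcup_{T \in \mathcal{T}} D_t(T), \\
E_t(T) &:= A_t \cap B_t \cap C_t \cap D_t(T), \\
E_t &:= \textstyle\bigcup_{T \in \mathcal{T}} E_t(T) = A_t \cap B_t \cap C_t \cap D_t.
\end{aligned}
\]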



In the following algorithm, successive calls to Random() are assumed to generate
independent random numbers, each uniformly distributed over [N].

Algorithm 4.3 (Terminating interruptible stationary sampling). For $t \geq 2N$,
let $E_t(T)$ and $E_t$ be defined as directly above, and let $E_0 := \emptyset$. The algorithm is:

t ← 0
repeat
    t ← t + 2N
    for i ← 0 to N
        U_i ← Random()
until $E_t$ holds
S ← T, for the unique $T \in \mathcal{T}$ such that $E_t(T)$ holds
return S
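A simulation sketch of Algorithm 4.3, under our reading of the events, may help in following the indices. It again assumes 0-indexed states (the paper's state 1 is index 0), simulates the chains itself in blocks of 2N steps, and uses illustrative names (`step`, `is_arborescence`, `algorithm_4_3`) together with a hypothetical chain that violates Assumption A.

```python
import random


def step(P, s, rng):
    """One transition from state s by inverse-CDF sampling."""
    u, acc = rng.random(), 0.0
    for j, p in enumerate(P[s]):
        acc += p
        if u < acc:
            return j
    return len(P) - 1          # guard against floating-point rounding


def is_arborescence(parent, root):
    """True if every vertex in `parent` reaches `root` without meeting a cycle."""
    for v in parent:
        seen = set()
        while v != root:
            if v in seen:
                return False
            seen.add(v)
            v = parent[v]
    return True


def algorithm_4_3(P, init, rng, max_blocks=10**6):
    """Process the N chains in blocks of 2N steps until E_t holds.

    Returns (t, root, parent) for the sampled tree.
    """
    n = len(P)
    start = list(init)
    for block in range(1, max_blocks + 1):
        # paths[c][s] is chain c at time t - 2N + s, for s = 0, ..., 2N.
        paths = []
        for s0 in start:
            path = [s0]
            for _ in range(2 * n):
                path.append(step(P, path[-1], rng))
            paths.append(path)
        U = [rng.randint(1, n) for _ in range(n + 1)]        # U_0, ..., U_N
        a = all(path[0] == 0 for path in paths)              # A_t
        b = paths[0][U[0]] == 0                              # B_t
        root = paths[0][U[0] + U[1]]
        sources = [paths[c][U[c + 1]] for c in range(1, n)]
        c_event = {root, *sources} == set(range(n))          # C_t
        if a and b and c_event:
            parent = {paths[c][U[c + 1]]: paths[c][U[c + 1] + 1]
                      for c in range(1, n)}
            if is_arborescence(parent, root):                # D_t
                return 2 * n * block, root, parent
        start = [path[-1] for path in paths]
    raise RuntimeError("no termination within max_blocks")


rng = random.Random(7)
P = [[0.0, 1.0], [0.5, 0.5]]     # hypothetical chain with p_11 = 0
print(algorithm_4_3(P, [1, 1], rng))
```

The random offsets only replace the "filler" transitions by steps of the averaged chain; the tree edges themselves are single steps of the original chain, which is consistent with the output being a sample from the tree distribution of P. For the example chain the stationary distribution is (1/3, 2/3).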

By modifying slightly the proof of Theorem 4.2, we obtain the following result.

Theorem 4.4. Algorithm 4.3 is a terminating algorithm for interruptible exact
tree-sampling. □



Remark 4.5. Our interest in providing Algorithm 4.3 is more of a theoretical nature
(to establish the possibility of terminating interruptible exact sampling, given enough
copies of a chain) than of a practical nature (to provide an efficient algorithm). Thus we
have not fine-tuned Algorithm 4.3 to improve its performance, and we will not analyze
its running time here.

Remark 4.6. If we make no assumption regarding the independence of the trajectories,
then interruptible sampling becomes impossible for $N \geq 2$ states, no matter how many
trajectories are available. Indeed, it is then possible that we are in the extreme case
that all the trajectories are identical, i.e., that there is "really" only one trajectory, in
which case Theorem 5.1 applies.



5 Impossibility of interruptible exact sampling (I) 

Algorithm 4.3 requires N independent synchronized Markov chain trajectories. This
may seem excessive, especially since for interesting chains N is often enormously large.
But our next main result, Theorem 5.1, shows that this is best possible. Note that to
prove Theorem 5.1, we need only show that interruptible exact sampling is impossible
using $N - 1$ independent trajectories. Indeed, if interruptible exact sampling is possible
with m independent trajectories, then for any m' > m it is possible with m' independent
trajectories, since extra trajectories can always be ignored.

Theorem 5.1. There is no algorithm in the passive setting for obtaining an observation 
from the stationary distribution of an irreducible aperiodic Markov chain on N states 
which uses fewer than N independent trajectories from the chain and which is both 
interruptible and exact. 



Proof. We first establish an equation [(5.3)] that must hold if there exists an
interruptible exact sampling algorithm for N-state chains (for given $N \geq 2$) that uses only a
single trajectory; in that case the discussion of Section 3.2 applies verbatim. A similar
equation, namely (5.4), must hold if interruptible exact sampling is possible using $N - 1$
trajectories. But (5.4) will lead to a contradiction via a transition-balancing argument.

So we begin with the case of a single trajectory. Suppose that functions $\phi_{k,i}$ satisfying
(i)-(iii) and (iv') of Section 3.2 exist. We remind the reader that (iii) and (iv') were
required to hold for all initial distributions $\rho$; throughout the present proof it will suffice
to consider trajectories starting deterministically at 1. Taking $\rho$ to be unit mass $\delta_1$ at 1
and $p_{ij}$ to be identically $1/N$, we find from (iii) that, for some $0 \leq k < \infty$,

\[ \sum_{j \in [N]} \sum_{x_1, \ldots, x_k} \phi_{k,j}(1, x_1, \ldots, x_k) > 0. \tag{5.1} \]

Let $k_0$ be the minimum such k, and define $\phi_j(\mathbf{x}) = \phi_j(x_1, \ldots, x_{k_0}) := \phi_{k_0,j}(1, x_1, \ldots, x_{k_0})$
and $\mathcal{X}_j := \{\mathbf{x} = (x_1, \ldots, x_{k_0}) : \phi_j(\mathbf{x}) > 0\}$ for $j \in [N]$. Again taking $\rho$ to be $\delta_1$ and $p_{ij}$
to be identically $1/N$, we find from (iv') and (5.1) that $\mathcal{X}_j \neq \emptyset$ for $j \in [N]$. Using (iv')
again, we find that for any transition matrix P with positive entries and stationary
distribution $\pi$,

\[ \sum_{\mathbf{x} \in \mathcal{X}_i} \phi_i(\mathbf{x}) \, p_{1,x_1} p_{x_1,x_2} \cdots p_{x_{k_0-1},x_{k_0}}
 = \pi_i \sum_{j \in [N]} \sum_{\mathbf{x} \in \mathcal{X}_j} \phi_j(\mathbf{x}) \, p_{1,x_1} p_{x_1,x_2} \cdots p_{x_{k_0-1},x_{k_0}}, \qquad i \in [N], \tag{5.2} \]

and all terms on both sides of (5.2) are positive. Recalling the notation of Section 3.1,
it now follows in particular that

\[ w_2 \sum_{\mathbf{x} \in \mathcal{X}_1} \phi_1(\mathbf{x}) \prod_{i,j} p_{ij}^{n_{ij}(\mathbf{x})}
 = w_1 \sum_{\mathbf{x} \in \mathcal{X}_2} \phi_2(\mathbf{x}) \prod_{i,j} p_{ij}^{n_{ij}(\mathbf{x})}, \tag{5.3} \]

where we write $n_{ij}(\mathbf{x})$ for the number of $i \to j$ transitions in the trajectory $(1, x_1, \ldots, x_{k_0})$
and again all terms on both sides of the equation are positive.

By the same reasoning, if there exists an interruptible exact sampling algorithm for
N-state chains that uses $N - 1$ independent trajectories, then there exist integer $k \geq 0$
and nonempty sets $\mathcal{X}_1$ and $\mathcal{X}_2$ of $(N-1)$-tuples

\[ \mathbf{x} = (x_1(1), \ldots, x_1(k); \; x_2(1), \ldots, x_2(k); \; \ldots; \; x_{N-1}(1), \ldots, x_{N-1}(k)) \]

of k-tuples from [N] such that, for any transition matrix P with positive entries,

\[ w_2 \sum_{\mathbf{x} \in \mathcal{X}_1} \phi_1(\mathbf{x}) \prod_{i,j} p_{ij}^{n_{ij}(\mathbf{x})}
 = w_1 \sum_{\mathbf{x} \in \mathcal{X}_2} \phi_2(\mathbf{x}) \prod_{i,j} p_{ij}^{n_{ij}(\mathbf{x})}, \tag{5.4} \]

where, for l = 1, 2 and every $\mathbf{x} \in \mathcal{X}_l$, we have $\phi_l(\mathbf{x}) > 0$, and where $n_{ij}(\mathbf{x})$ is the
sum over $1 \leq m \leq N - 1$ of the numbers of $i \to j$ transitions within the trajectories
$(1, x_m(1), \ldots, x_m(k))$. To complete the proof, we will show that (5.4) cannot possibly
hold. We will make key use of the observation that, for any $\mathbf{x} \in \mathcal{X}_1 \cup \mathcal{X}_2$,

\[ 0 \leq n_{1+}(\mathbf{x}) - n_{+1}(\mathbf{x}) \leq N - 1, \tag{5.5} \]

where we have introduced the notation

\[ n_{1+}(\mathbf{x}) := \sum_{j=2}^{N} n_{1j}(\mathbf{x}), \qquad n_{+1}(\mathbf{x}) := \sum_{i=2}^{N} n_{i1}(\mathbf{x}) \tag{5.6} \]

for the total numbers of transitions out of and into state 1, respectively. Indeed, since
each trajectory $(1, x_m(1), \ldots, x_m(k))$ starts in state 1, the number of transitions out of
state 1 within such a trajectory either equals or exceeds by one the number of transitions
into state 1.
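The balance observation behind (5.5) can be confirmed by exhaustive enumeration for small cases. The sketch below is illustrative only (the function name `out_minus_in` and the choice N = 3, k = 4 are assumptions); it counts, for a single trajectory started at state 1, the transitions leaving state 1 for another state minus those entering state 1 from another state.

```python
from itertools import product


def out_minus_in(traj, state=1):
    """Transitions out of `state` (to a different state) minus transitions into it."""
    pairs = list(zip(traj, traj[1:]))
    out_count = sum(1 for a, b in pairs if a == state and b != state)
    in_count = sum(1 for a, b in pairs if a != state and b == state)
    return out_count - in_count


N, k = 3, 4
# All trajectories (1, x(1), ..., x(k)) over the state space {1, ..., N}.
diffs = {out_minus_in((1,) + tail) for tail in product(range(1, N + 1), repeat=k)}
print(diffs)
```

Each trajectory contributes a difference of 0 (it ends in state 1) or 1 (it ends elsewhere), so summing over $N - 1$ trajectories gives a total between 0 and $N - 1$, as in (5.5).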

To obtain the desired contradiction, we begin by observing that (5.4) can be written
in the form (eliminating the diagonal variables $p_{ii}$)

\[ w_2 f_1 = w_1 f_2 \tag{5.7} \]

for all $(p_{ij} > 0 : 1 \leq i \neq j \leq N)$ such that $\sum_{j : j \neq i} p_{ij} < 1$ for every $i \in [N]$, where

\[ f_l := \sum_{\mathbf{x} \in \mathcal{X}_l} \phi_l(\mathbf{x}) \left[ \prod_{i,j : i \neq j} p_{ij}^{n_{ij}(\mathbf{x})} \right]
 \prod_{i} \Bigl( 1 - \sum_{j : j \neq i} p_{ij} \Bigr)^{n_{ii}(\mathbf{x})}, \qquad l = 1, 2. \tag{5.8} \]

Using continuity it follows that (5.7) holds for all $(p_{ij} \geq 0 : 1 \leq i \neq j \leq N)$ such that
$\sum_{j : j \neq i} p_{ij} \leq 1$ for every $i \in [N]$.

For l = 1, 2, note that $f_l$ and $w_l$ are both polynomial expressions in the variables
$p_{ij}$, $1 \leq i \neq j \leq N$ (we will denote this entire collection of $N(N-1)$ variables by $\mathbf{p}$);
in fact, $w_1$ is a polynomial expression in the $(N-1)^2$ variables $p_{ij}$ with $i, j \in [N]$ and
$i \notin \{1, j\}$ (with a similar reduction in number of variables possible for $w_2$). Applying
Proposition A.1 (see the Appendix) to $F := w_2 f_1 - w_1 f_2$, we conclude that (5.7) holds
as an equality in the ring of polynomials in the variables $\mathbf{p}$ over the complex field.
Henceforth we shall write $G_1 = G_2$ to indicate such an identity of polynomials $G_1$, $G_2$.

According to Lemma A.2 in the Appendix, the polynomial $w_1$ (again, over the
complex field) is irreducible; likewise, so is $w_2$. From the polynomial identity $w_2 f_1 =
w_1 f_2$ at (5.7) it then follows that we can write

\[ f_l = w_l f, \qquad l = 1, 2, \tag{5.9} \]

for some polynomial f in $\mathbf{p}$. Of course, the polynomial identities (5.9) remain true as
we now reduce the number of variables to three by setting $p_{1j}$ to $\alpha$ for $j \neq 1$, $p_{i1}$ to $\beta$
for $i \neq 1$, and $p_{ij}$ to $\gamma$ if $i \neq 1$, $j \neq 1$, and $i \neq j$. Observe that now



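\[ f_l(\alpha, \beta, \gamma) = \sum_{\mathbf{x} \in \mathcal{X}_l} \phi_l(\mathbf{x}) \, \alpha^{n_{1+}(\mathbf{x})} \beta^{n_{+1}(\mathbf{x})} \gamma^{n_{++}(\mathbf{x})}
 \times [1 - (N-1)\alpha]^{n_{11}(\mathbf{x})} \, [1 - \beta - (N-2)\gamma]^{H(\mathbf{x})}, \qquad l = 1, 2, \tag{5.10} \]

recalling (5.6) and defining

\[ n_{++}(\mathbf{x}) := \sum_{i,j \in \{2, \ldots, N\} : i \neq j} n_{ij}(\mathbf{x}), \qquad H(\mathbf{x}) := \sum_{i=2}^{N} n_{ii}(\mathbf{x}). \tag{5.11} \]

Also now, by a simple generalization of the bijection argument ([^], Section 2.3.4.4,
p. 390) showing that the number of arborescences rooted at 1 is $N^{N-2}$,

\[ w_1(\beta, \gamma) = \beta \, [\beta + (N-1)\gamma]^{N-2}; \tag{5.12} \]

and

\[ w_2(\alpha, \beta, \gamma) = \alpha \, v_2(\beta, \gamma) \tag{5.13} \]

for some polynomial $v_2(\beta, \gamma)$ which is not divisible by $\beta$ [the explanation for (5.13) being
that any $T \in \mathcal{T}_2$ has precisely one directed edge leaving vertex 1 and that there exists
$T \in \mathcal{T}_2$ for which 1 is a leaf. In fact, it can be shown that $v_2(\beta, \gamma) = [\beta + (N-1)\gamma]^{N-2}$,
but we won't need this.]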
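The closed forms for the three-parameter chain can be checked numerically by brute-force enumeration of arborescences for a small N. This is a sketch under stated assumptions: states are 0-indexed (index 0 is the paper's state 1, index 1 is state 2), and the helper names `root_weight` and `three_parameter_matrix` and the numeric values are illustrative.

```python
from itertools import product


def root_weight(P, root):
    """Sum of w(T) over all arborescences rooted at `root`, by brute force."""
    n = len(P)
    others = [v for v in range(n) if v != root]
    total = 0.0
    for heads in product(range(n), repeat=len(others)):
        parent = dict(zip(others, heads))
        ok = True
        for start in others:
            v, seen = start, set()
            while v != root:
                if v in seen:          # cycle: not an arborescence
                    ok = False
                    break
                seen.add(v)
                v = parent[v]
            if not ok:
                break
        if ok:
            weight = 1.0
            for v in others:
                weight *= P[v][parent[v]]
            total += weight
    return total


def three_parameter_matrix(n, alpha, beta, gamma):
    """Off-diagonal entries alpha (row 0), beta (column 0), gamma (elsewhere)."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                P[i][j] = alpha if i == 0 else (beta if j == 0 else gamma)
        P[i][i] = 1.0 - sum(P[i])
    return P


n, alpha, beta, gamma = 4, 0.1, 0.2, 0.15
P = three_parameter_matrix(n, alpha, beta, gamma)
w1 = root_weight(P, 0)
w2 = root_weight(P, 1)
print(w1, beta * (beta + (n - 1) * gamma) ** (n - 2))
print(w2, alpha * (beta + (n - 1) * gamma) ** (n - 2))
```

Both printed pairs should agree up to rounding, in line with (5.12) and with the stated form of $v_2$.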

The idea for the remainder of the proof is to derive from the identities (5.9)-(5.13)
a polynomial identity in the single variable $\beta$, namely (5.14), and then show that (5.14)
leads to a contradiction. We will produce (5.14) by eliminating (using suitable
divisibility arguments) first $\alpha$ and then $\gamma$. These arguments are carried out in the next two
lemmas. □



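Lemma 5.2. Suppose that there exists an interruptible exact algorithm in the passive
setting for sampling from the stationary distribution of an irreducible aperiodic Markov
chain on N states which uses fewer than N independent trajectories from the chain.
Then there exist nonempty sets $\mathcal{X}'_1$ and $\mathcal{X}''_1$ and a polynomial r such that

\[ \sum_{\mathbf{x} \in \mathcal{X}''_1} \phi_1(\mathbf{x}) \, \beta^{n_{+1}(\mathbf{x}) - m_1(\beta)} (1 - \beta)^{H(\mathbf{x})} = \beta^{N-2} \, r(\beta), \tag{5.14} \]

where

\[ m_1(\beta) = \min_{\mathbf{x} \in \mathcal{X}'_1} n_{+1}(\mathbf{x}). \]

Proof. Let $m_l(\alpha)$ denote the highest power of $\alpha$ that divides $f_l$ at (5.10) and define
$\tilde{m}_l(\alpha) := \min_{\mathbf{x} \in \mathcal{X}_l} n_{1+}(\mathbf{x})$. We claim that $m_l(\alpha) = \tilde{m}_l(\alpha)$, and note that this sort of
highest-power observation will be used frequently, and without accompanying proof,
in the sequel. [Indeed, $m_l(\alpha) \geq \tilde{m}_l(\alpha)$ is clear. To see the reverse inequality, divide $f_l$
by $\alpha^{\tilde{m}_l(\alpha)}$ and set $\alpha$ to 0 to obtain the expression

\[ \sum_{\mathbf{x} \in \mathcal{X}_l : \, n_{1+}(\mathbf{x}) = \tilde{m}_l(\alpha)} \phi_l(\mathbf{x}) \, \beta^{n_{+1}(\mathbf{x})} \gamma^{n_{++}(\mathbf{x})} [1 - \beta - (N-2)\gamma]^{H(\mathbf{x})} =: g_l(\beta, \gamma), \tag{5.15} \]

which is not the zero polynomial since it has a positive value when $\beta = 1/N = \gamma$.]
By (5.9), (5.12), and (5.13),

\[ m_2(\alpha) = m_1(\alpha) + 1 \quad \text{and} \tag{5.16} \]
\[ g_l = v_l \, g, \qquad l = 1, 2, \tag{5.17} \]

where $g_l$ is the polynomial defined at (5.15) [recalling $\tilde{m}_l(\alpha) = m_l(\alpha)$], $v_1 := w_1$, $v_2$ is
defined at (5.13), and g is obtained from f by dividing by $\alpha^{m_1(\alpha)}$ and then setting $\alpha = 0$.

Define $\mathcal{X}'_l := \{\mathbf{x} \in \mathcal{X}_l : n_{1+}(\mathbf{x}) = m_l(\alpha)\} \neq \emptyset$ for l = 1, 2. Then, similarly, the
highest power $m_l(\beta)$ of $\beta$ dividing $g_l$ is $\min_{\mathbf{x} \in \mathcal{X}'_l} n_{+1}(\mathbf{x})$;

\[ m_2(\beta) = m_1(\beta) - 1; \tag{5.18} \]

and, with

\[ h_1(\beta, \gamma) := \sum_{\mathbf{x} \in \mathcal{X}'_1} \phi_1(\mathbf{x}) \, \beta^{n_{+1}(\mathbf{x}) - m_1(\beta)} \gamma^{n_{++}(\mathbf{x})} [1 - \beta - (N-2)\gamma]^{H(\mathbf{x})}, \]

we have

\[ h_1(\beta, \gamma) = [\beta + (N-1)\gamma]^{N-2} \, h(\beta, \gamma) \tag{5.19} \]

for some polynomial h.

The highest power $m_1(\gamma)$ of $\gamma$ dividing $h_1$ is $\min_{\mathbf{x} \in \mathcal{X}'_1} n_{++}(\mathbf{x})$. Divide both sides
of (5.19) by $\gamma^{m_1(\gamma)}$ and set $\gamma$ to 0 to find that (5.14) holds for some polynomial r, where
$\mathcal{X}''_1 := \{\mathbf{x} \in \mathcal{X}'_1 : n_{++}(\mathbf{x}) = m_1(\gamma)\} \neq \emptyset$. □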

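Lemma 5.3. The identity (5.14) cannot hold.

Proof. It follows from (5.14) that $n_{+1}(\mathbf{x}) \geq m_1(\beta) + N - 2$ for all $\mathbf{x} \in \mathcal{X}''_1$. But then,
for any such $\mathbf{x}$ and some $\mathbf{x}' \in \mathcal{X}'_2$,

\[
\begin{aligned}
n_{+1}(\mathbf{x}) &\geq m_1(\beta) + N - 2 \\
&= m_2(\beta) + N - 1 && \text{by (5.18)} \\
&= n_{+1}(\mathbf{x}') + N - 1 \\
&\geq n_{1+}(\mathbf{x}') && \text{by the second inequality in (5.5)} \\
&= m_2(\alpha) \\
&= m_1(\alpha) + 1 && \text{by (5.16)} \\
&= n_{1+}(\mathbf{x}) + 1,
\end{aligned}
\]

contradicting the first inequality in (5.5). □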



6 Impossibility of interruptible exact sampling (II) 



Algorithm 4.3 succeeds in using N independent synchronized Markov chain trajectories
to carry out interruptible exact sampling. But the algorithm assumes that each of
the trajectories has not only (i) the same stationary distribution, but also (ii) the
same transition matrix. In this section we show (Theorem 6.1) that interruptible exact
sampling becomes impossible when assumption (ii) is dropped, no matter how (finitely)
many trajectories are available.

Theorem 6.1. There is no interruptible algorithm in the passive setting for obtaining 
an observation exactly from the common stationary distribution of any finite number of 
independent irreducible aperiodic Markov chains on N states.

Proof. Let M denote the number of trajectories available. We first prove the
impossibility of interruptible exact sampling when M = N = 2, then more generally when
N = 2 (regardless of M), and finally for general N.
For M = N = 2, we note that if

\[ 0 < p_{12}, p_{21} < 1 \quad \text{and} \quad 0 < \rho < 1/\max\{p_{12}, p_{21}\}, \tag{6.1} \]

then

\[ \begin{pmatrix} 1 - p_{12} & p_{12} \\ p_{21} & 1 - p_{21} \end{pmatrix}
 \quad \text{and} \quad
 \begin{pmatrix} 1 - \rho p_{12} & \rho p_{12} \\ \rho p_{21} & 1 - \rho p_{21} \end{pmatrix} \]

are irreducible aperiodic transition matrices with common stationary distribution

\[ \pi = \left( \frac{p_{21}}{p_{21} + p_{12}}, \; \frac{p_{12}}{p_{21} + p_{12}} \right). \]
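The claim that the two matrices share a stationary distribution is a one-line computation; the following sketch simply checks it numerically. The helper names `two_state_pair` and `left_multiply` and the parameter values are illustrative assumptions.

```python
def two_state_pair(p12, p21, rho):
    """Return the two transition matrices and their claimed common stationary vector."""
    P = [[1 - p12, p12], [p21, 1 - p21]]
    Q = [[1 - rho * p12, rho * p12], [rho * p21, 1 - rho * p21]]
    pi = [p21 / (p21 + p12), p12 / (p21 + p12)]
    return P, Q, pi


def left_multiply(pi, M):
    """Row vector times matrix."""
    return [sum(pi[i] * M[i][j] for i in range(len(pi))) for j in range(len(M[0]))]


# Parameters satisfying (6.1): 0 < rho < 1 / max(p12, p21) = 1 / 0.6.
P, Q, pi = two_state_pair(0.3, 0.6, 1.5)
print(pi, left_multiply(pi, P), left_multiply(pi, Q))
```

All three printed vectors should coincide up to rounding, even though the second chain moves between states at a different rate than the first.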



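Arguing as in the proof of Theorem 5.1, if there exists an interruptible exact sampling
algorithm in the present setting, then there exist integer k > 0, nonempty sets $\mathcal{Z}_1$ and $\mathcal{Z}_2$
of pairs

\[ \mathbf{z} = (\mathbf{x}, \mathbf{y}) = (x_1, \ldots, x_k; \; y_1, \ldots, y_k) \]

of k-tuples from {1, 2}, and positive numbers $\psi_l(\mathbf{z})$ ($\mathbf{z} \in \mathcal{Z}_l$, l = 1, 2) such that,
whenever (6.1) holds,

\[ p_{12} f_1 = p_{21} f_2 \tag{6.2} \]

where, using transition-count notation $n_{ij}$ like that in the proof of Theorem 5.1,

\[
\begin{aligned}
f_l = \sum_{\mathbf{z} \in \mathcal{Z}_l} \psi_l(\mathbf{z}) \,
 & p_{12}^{n_{12}(\mathbf{x}) + n_{12}(\mathbf{y})} \, p_{21}^{n_{21}(\mathbf{x}) + n_{21}(\mathbf{y})} \, \rho^{n_{12}(\mathbf{y}) + n_{21}(\mathbf{y})}
 (1 - p_{12})^{n_{11}(\mathbf{x})} (1 - p_{21})^{n_{22}(\mathbf{x})} \\
 & \times (1 - \rho p_{12})^{n_{11}(\mathbf{y})} (1 - \rho p_{21})^{n_{22}(\mathbf{y})}, \qquad l = 1, 2.
\end{aligned} \tag{6.3}
\]

Using induction on the $\rho$-degree of the polynomial $p_{12} f_1 - p_{21} f_2$ and Proposition A.1, it
is easy to show that (6.2) holds as an equality in the ring of polynomials in the variables
$p_{12}, p_{21}, \rho$ over the complex field.
For l = 1, 2, let

\[ m_l = \min\{ [n_{12}(\mathbf{y}) + n_{21}(\mathbf{y})] : \mathbf{z} = (\mathbf{x}, \mathbf{y}) \in \mathcal{Z}_l \text{ for some } \mathbf{x} \} \]

denote the highest power of $\rho$ that divides $f_l$. Then, by (6.2), $m_2 = m_1$. Divide both
sides of (6.2) by $\rho^{m_1}$ and then set $\rho$ to 0 to obtain

\[ p_{12} \, g_1 = p_{21} \, g_2, \tag{6.4} \]

where

\[ g_l = \sum_{\mathbf{z} \in \mathcal{Z}'_l} \psi_l(\mathbf{z}) \, p_{12}^{n_{12}(\mathbf{x}) + n_{12}(\mathbf{y})} \, p_{21}^{n_{21}(\mathbf{x}) + n_{21}(\mathbf{y})}
 (1 - p_{12})^{n_{11}(\mathbf{x})} (1 - p_{21})^{n_{22}(\mathbf{x})}, \qquad l = 1, 2, \]

with $\mathcal{Z}'_l := \{\mathbf{z} = (\mathbf{x}, \mathbf{y}) \in \mathcal{Z}_l : n_{12}(\mathbf{y}) + n_{21}(\mathbf{y}) = m_1\} \neq \emptyset$. But [cf. (5.5) with N = 2], if
$\mathbf{z} = (\mathbf{x}, \mathbf{y}) \in \mathcal{Z}'_1 \cup \mathcal{Z}'_2$ then $n_{12}(\mathbf{y}) = \lceil m_1/2 \rceil$ and $n_{21}(\mathbf{y}) = \lfloor m_1/2 \rfloor$. Dividing both sides
of (6.4) by $p_{12}^{\lceil m_1/2 \rceil} p_{21}^{\lfloor m_1/2 \rfloor}$ we obtain the polynomial identity

\[ p_{12} \, h_1 = p_{21} \, h_2, \tag{6.5} \]

where

\[ h_l = \sum_{\mathbf{x} \in \mathcal{X}'_l} \phi_l(\mathbf{x}) \, p_{12}^{n_{12}(\mathbf{x})} \, p_{21}^{n_{21}(\mathbf{x})} (1 - p_{12})^{n_{11}(\mathbf{x})} (1 - p_{21})^{n_{22}(\mathbf{x})}, \qquad l = 1, 2, \tag{6.6} \]

with

\[ \mathcal{X}'_l := \{\mathbf{x} : \text{there exists } \mathbf{y} \text{ such that } (\mathbf{x}, \mathbf{y}) \in \mathcal{Z}'_l\} \neq \emptyset, \qquad l = 1, 2, \]

and, for $\mathbf{x} \in \mathcal{X}'_l$,

\[ \phi_l(\mathbf{x}) := \sum_{\mathbf{y} : (\mathbf{x}, \mathbf{y}) \in \mathcal{Z}'_l} \psi_l(\mathbf{x}, \mathbf{y}) > 0. \]

But (6.5) is the case N = 2 of (5.4), which, as shown in the proof of Theorem 5.1,
cannot hold. This contradiction establishes the theorem in the case M = N = 2.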

We leave to the reader the routine extension of the above proof to the case of
arbitrary M and N = 2. A sketch is that now there are $M - 1$ parameters $\rho_j$, but
by using the same sort of argument for each $\rho_j$ in succession that we used above for $\rho$,
one again obtains a contradiction of the form (6.5) [with $\phi_l(\mathbf{x}) > 0$ for all $\mathbf{x} \in \mathcal{X}'_l \neq \emptyset$,
l = 1, 2].

We complete the proof of the theorem by showing that an algorithm for interruptible
exact sampling using M independent trajectories from chains with $N \geq 3$ states could
be converted into one for two-state chains.

Indeed, while watching independent trajectories of M generic irreducible aperiodic
two-state chains $X_1, \ldots, X_M$ with common (unknown) stationary distribution
$\pi = (\pi_1, \pi_2)$, contemporaneously construct M independent irreducible aperiodic N-state
chains $Y_1, \ldots, Y_M$ by letting $Y_i(t) = 1$ whenever $X_i(t) = 1$ and selecting an independent
uniform random value from {2, . . . , N} as the value of $Y_i(t)$ at each time t such that
$X_i(t) = 2$. The stationary distribution for each $Y_i$ is $(\pi_1, \pi_2/(N-1), \ldots, \pi_2/(N-1))$.
Applying the size-N algorithm to $Y_1, \ldots, Y_M$, suppose the output state is $S'$. To finish
the construction of the two-state algorithm, output $S := \min\{S', 2\}$. □
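The reduction in the last step of the proof is mechanical, and a short sketch may make it easier to follow. The helper names `lift_state`, `lift_trajectory` and `collapse_output` are illustrative assumptions; states are numbered from 1 as in the text.

```python
import random


def lift_state(x, n, rng):
    """Map a two-state value x in {1, 2} to {1, ..., n}.

    State 1 is kept; state 2 is replaced by an independent uniform draw
    from {2, ..., n}.
    """
    return 1 if x == 1 else rng.randint(2, n)


def lift_trajectory(xs, n, rng):
    """Lift an observed two-state trajectory to an n-state trajectory, step by step."""
    return [lift_state(x, n, rng) for x in xs]


def collapse_output(s_prime):
    """Turn the output S' of the n-state algorithm into a two-state output."""
    return min(s_prime, 2)


rng = random.Random(3)
observed = [1, 2, 2, 1, 2]            # hypothetical two-state observations
lifted = lift_trajectory(observed, 5, rng)
print(lifted, [collapse_output(y) for y in lifted])
```

Collapsing the lifted trajectory recovers the original one, which reflects why an exact N-state output collapses to an exact two-state output: the lifted stationary mass on {2, ..., N} sums to $\pi_2$.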

Remark 6.2. (a) Any two-state chain is reversible, as are the chains $Y_i$ constructed in
the preceding paragraph. Thus Theorem 6.1 remains true even if we assume that the
chains are all reversible.

(b) Similarly, as mentioned in Section 1, interruptible exact sampling from the
stationary distribution is not possible when one observes only a single trajectory from an
irreducible aperiodic reversible finite-state chain.

(c) For $N \geq 3$ we do not know whether Theorem 5.1 remains true if one assumes
that the chain is reversible.



Acknowledgment. We thank Dan Naiman for helpful discussions related to the 
Appendix.

A Appendix: Polynomials

In this Appendix we establish two basic facts about polynomials; these were used in the
proof of Theorem 5.1. Throughout the Appendix, we write F = G to indicate that F
and G are the same element in the ring of polynomials (in some specified finite collection
of variables) over the complex field.

The first fact is quite simple. For completeness, we include an elementary proof. 

Proposition A.1. Let

\[ \mathbf{x} = (x_{ij} : 1 \leq i \leq n, \ 1 \leq j \leq k_i) \]

be a double array of variables, where $n \geq 0$ and $k_i \geq 1$ for $1 \leq i \leq n$. If $F(\mathbf{x})$ is a
polynomial expression that vanishes whenever $x_{ij} \geq 0$ for all i, j and $\sum_{j=1}^{k_i} x_{ij} \leq 1$ for
all i, then $F(\mathbf{x}) = 0$.

Proof. Let $K := \sum_{i=1}^{n} k_i$. The proof is by (strong) induction on $\kappa := K + \deg F$, for
which (if F is not the zero polynomial) the smallest possible value is $\sum_{1 \leq i \leq 0} 1 + 0 = 0$.
The base case $\kappa = 0$ of the induction is trivial.

For the induction step we may assume $n \geq 1$ and $k_n \geq 1$. Dividing the polynomial
$F(\mathbf{x})$ by $x_{n,k_n}$, we can write

\[ F(\mathbf{x}) = x_{n,k_n} F_1(\mathbf{x}) + F_2(\mathbf{x}') \tag{A.1} \]

for polynomials $F_1$ and $F_2$, where the variables collection $\mathbf{x}'$ excludes the single variable
$x_{n,k_n}$. Setting $x_{n,k_n}$ to 0 in (A.1), we see that $F_2(\mathbf{x}')$ is a polynomial satisfying the
hypothesis of the proposition; and (in obvious notation) $K_2 = K - 1$ and $\deg F_2 \leq
\deg F$, so that $\kappa_2 < \kappa$. By induction, $F_2(\mathbf{x}') = 0$, and so from (A.1) we now have
$F(\mathbf{x}) = x_{n,k_n} F_1(\mathbf{x})$. But now $K_1 = K$ and $\deg F_1 = \deg F - 1$, so that $\kappa_1 = \kappa - 1$, and
one sees that $F_1(\mathbf{x})$ satisfies the hypothesis of the proposition. By induction, $F_1(\mathbf{x}) = 0$;
we conclude that $F(\mathbf{x}) = 0$, as desired. □

As is well known (e.g., Q, Chapter 4), for any $n \geq 1$ the ring $\mathbb{C}[x_1, \ldots, x_n]$ of
polynomials in the variables $x_1, \ldots, x_n$ over the complex field $\mathbb{C}$ is a unique factorization
domain. This means that every nonzero polynomial in $\mathbb{C}[x_1, \ldots, x_n]$ can be written
uniquely (up to complex scalar multiples) as a (possibly empty) finite product of
irreducible polynomials. (A polynomial is said to be irreducible if it cannot be factored as
the product of two nonconstant polynomials.)

Lemma A.2. The polynomial $w_1$ [i.e., the polynomial in the $(N-1)^2$ variables $p_{ij}$ with
$i, j \in [N]$ and $i \notin \{1, j\}$ defined in Section 3.1] is irreducible over the complex field.

Proof. The proof is by induction on N. For N = 1, the polynomial $w_1 = 1$ (in no
variables) is certainly irreducible. For N = 2, the polynomial $w_1 = p_{21}$ in the single
variable $p_{21}$ is irreducible. To carry out the induction step for $N \geq 3$, we will use
another induction, on l, to prove the following claim.

Claim. For $3 \leq l \leq N + 1$, let $y_l$ denote the polynomial in $(N-1)^2 - (N + 1 - l)$
variables obtained from $w_1$ by setting $p_{m1}$ to 0 for $l \leq m \leq N$. Then $y_l$ is irreducible
for $4 \leq l \leq N + 1$.

To prove the claim, we begin by noting that $y_3$ has the factorization

\[ y_3 = p_{21} \, \omega_2, \tag{A.2} \]

where the polynomial

\[ \omega_2 = \omega_2((p_{ij} : 2 \leq i, j \leq N \text{ and } i \notin \{2, j\})) \]

is obtained from the polynomial $w_1$ for the state space $[N - 1]$ by changing each variable
name from $p_{ij}$ to $p_{i+1,j+1}$. By the induction hypothesis for our N-induction, $\omega_2$ is
irreducible. Since $p_{21}$ is clearly irreducible, we conclude that (A.2) is a prime factorization
of $y_3$.

We now treat the base case l = 4 of our l-induction. Observe that $y_4$ is not the zero
polynomial (consider, e.g., the tree $N \to N-1 \to \cdots \to 2 \to 1$) and that $y_4$ is linear in $p_{31}$.
If $y_4$ is reducible, then we can write

\[ y_4 = (g_1 p_{31} + g_2) \, g_3, \tag{A.3} \]

where $g_i$ is a polynomial free of the variable $p_{31}$ (i = 1, 2, 3) and $g_3$ is nonconstant. If we
now set $p_{31}$ to 0 in (A.3), the result is $y_3 = g_2 g_3$. From the prime factorization (A.2) we
conclude that either $p_{21}$ or $\omega_2$ divides $g_3$. But this is wrong: (i) $p_{21}$ does not divide $g_3$
because it clearly does not divide $y_4$ (consider, e.g., the tree $N \to N-1 \to \cdots \to 4 \to
2 \to 3 \to 1$), and (ii) $\omega_2$ does not divide $g_3$ because (we claim) it, too, fails to divide $y_4$.
(Indeed, setting $p_{m2}$ to 0 for $3 \leq m \leq N$ causes $\omega_2$, but clearly not $y_4$, to vanish.)
From this contradiction we conclude that $y_4$ is irreducible, establishing the l-induction
base case.

For the l-induction step, let $l \geq 5$. If $y_l$ is reducible, then we can write

\[ y_l = (h_1 p_{l-1,1} + h_2) \, h_3, \tag{A.4} \]

where $h_i$ is a polynomial free of the variable $p_{l-1,1}$ (i = 1, 2, 3) and $h_3$ is nonconstant. If
we now set $p_{l-1,1}$ to 0 in (A.4), the result is $y_{l-1} = h_2 h_3$. By the l-induction hypothesis, it
must be that $h_3$ is a nonzero complex scalar multiple of $y_{l-1}$; from (A.4) we then deduce
that $y_{l-1}$ divides $y_l$. But this is wrong, because setting $p_{m1}$ to 0 for $2 \leq m \leq l - 2$
causes $y_{l-1}$, but clearly not $y_l$, to vanish. From this contradiction we conclude that $y_l$
is irreducible, completing the l-induction.

Finally, set l to $N + 1$ in the claim to find that $w_1 = y_{N+1}$ is irreducible, completing
the N-induction and the proof of the lemma. □



